Annotation and differential analysis of alternative splicing using de novo assembly of RNAseq data

Authors

  • Clara Benoit-Pilven
  • Camille Marchet
  • Emilie Chautard
  • Leandro Lima
  • Marie-Pierre Lambert
  • Gustavo Sacomoto
  • Amandine Rey
  • Cyril Bourgeois
  • Didier Auboeuf
  • Vincent Lacroix
Abstract

Genome-wide analyses reveal that more than 90% of multi-exonic human genes produce at least two transcripts through alternative splicing (AS). Various bioinformatics methods are available to analyze AS from RNAseq data. Most methods start by mapping the reads to an annotated reference genome, but some start with a de novo assembly of the reads. In this paper, we present a systematic comparison of a mapping-first approach (FaRLine) and an assembly-first approach (KisSplice). Both approaches are event-based: they focus on the regions of the transcripts that vary in their exon content. We applied these methods to an RNAseq dataset from a neuroblastoma SK-N-SH cell line (ENCODE), differentiated or not using retinoic acid. We found that the predictions of the two pipelines overlapped (70% of exon skipping events were common), but with noticeable differences. The assembly-first approach found more novel variants, including unannotated exons and splice sites. It also predicted AS in families of paralogous genes. The mapping-first approach found more lowly expressed splicing variants and was better at predicting exons overlapping repeated elements. This work demonstrates that annotating AS with a single approach leads to missing a large number of candidates. We further show that these candidates cannot be neglected, since many of them are differentially regulated across conditions and can be validated experimentally. We therefore advocate the combined use of both mapping-first and assembly-first approaches for the annotation and differential analysis of AS from RNAseq data.

bioRxiv preprint first posted online Sep. 12, 2016; doi: http://dx.doi.org/10.1101/074807. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder. It is made available under a CC-BY-NC-ND 4.0 International license.

Similar resources

Clustering of Short Read Sequences for de novo Transcriptome Assembly

Given the importance of transcriptome analysis in various biological studies and considering the vast amount of whole transcriptome sequencing data, it seems necessary to develop an algorithm to assemble transcriptome data. In this study we propose an algorithm for transcriptome assembly in the absence of a reference genome. First, the contiguous sequences are generated using de Bruijn graph with d...

Arkas: Rapid reproducible RNAseq analysis

The recently introduced Kallisto pseudoaligner has radically simplified the quantification of transcripts in RNA-sequencing experiments. We offer cloud-scale RNAseq pipelines Arkas-Quantification, which deploys Kallisto for parallel cloud computations, and Arkas-Analysis, which annotates the Kallisto results by extracting structured information directly from source FASTA files with per-contig ...

Semantic Assembly and Annotation of Draft RNAseq Transcripts without a Reference Genome

Transcriptomes are one of the first sources of high-throughput genomic data that have benefitted from the introduction of Next-Gen Sequencing. As sequencing technology becomes more accessible, transcriptome sequencing is applicable to multiple organisms for which genome sequences are unavailable. Currently all methods for de novo assembly are based on the concept of matching the nucleotide cont...

P-70: Evidence for Differential Gene Expression of A Major Epigenetic Modifier Enzyme, de novo DNA Methyltransferase 3b, through Vitrification of Mouse Ovary Tissue

Background: Ovarian tissue cryopreservation is a feasible method to preserve female reproductive potential, especially in young patients with cancer or in women at risk of premature ovarian failure. Vitrification has recently emerged as a new trend for biological specimen preservation. On the other hand, gene expression that changes during vitrification can influence oocyte maturation and need ...

Pathway Analysis of miRNA-1 and Its Expression Evaluation in Donor’s Serum from HIV-Positive Individuals vs Unaffected Controls

Background: MicroRNAs (miRNAs) are non-coding RNA molecules (19-24 nucleotides) that play a major role in a wide range of biological processes through post-transcriptional regulation of gene expression. Differential expression of miRNAs has been reported in various infectious diseases such as HIV infection. The characterization of miRNA expression profiles, especially in mammalian biofluids, whi...


Publication date: 2016